A Greedy Part Assignment Algorithm for Real-time Multi-person 2D Pose Estimation

Authors

  • Srenivas Varadarajan
  • Parual Datta
  • Omesh Tickoo
Abstract

Human pose estimation in a multi-person image involves detecting the various body parts and grouping them into individual person clusters. While the former task is challenging due to mutual occlusions, the combinatorial complexity of the latter task is very high. We propose a greedy part assignment algorithm that exploits the inherent structure of the human body to achieve lower complexity than any previously published work. This is accomplished by (i) reducing the number of part candidates using the estimated number of people in the image, (ii) performing a greedy sequential assignment of part classes, following the kinematic chain from head to ankle, (iii) performing a greedy assignment of the parts in each part-class set to person clusters, (iv) limiting the candidate person clusters to the most proximal clusters using human anthropometric data, and (v) using only a specific subset of pre-assigned parts for establishing pairwise structural constraints. We show that these steps sparsify the body-parts relationship graph and reduce the algorithm's complexity to be linear in the number of candidates of any single part class. We also propose methods for improving the accuracy of pose estimation by (i) spawning person clusters from any unassigned significant body part and (ii) suppressing hallucinated parts. On the MPII multi-person pose database, pose estimation using the proposed method takes only 0.14 seconds per image. We show that our proposed algorithm, by using a large spatial and structural context, achieves state-of-the-art accuracy on both the MPII and WAF multi-person pose datasets, demonstrating the robustness of our approach.


Similar resources

Dense 3D face alignment from 2D video for real-time use

To enable real-time, person-independent 3D registration from 2D video, we developed a 3D cascade regression approach in which facial landmarks remain invariant across pose over a range of approximately 60 degrees. From a single 2D image of a person’s face, a dense 3D shape is registered in real time for each frame. The algorithm utilizes a fast cascade regression framework trained on high-resol...


LCR-Net++: Multi-person 2D and 3D Pose Detection in Natural Images

We propose an end-to-end architecture for joint 2D and 3D human pose estimation in natural images. Key to our approach is the generation and scoring of a number of pose proposals per image, which allows us to predict 2D and 3D poses of multiple people simultaneously. Hence, our approach does not require an approximate localization of the humans for initialization. Our Localization-Classificatio...


Single-Shot Multi-Person 3D Body Pose Estimation From Monocular RGB Input

We propose a new efficient single-shot method for multi-person 3D pose estimation in general scenes from a monocular RGB camera. Our fully convolutional DNN-based approach jointly infers 2D and 3D joint locations on the basis of an extended 3D location map supported by body part associations. This new formulation enables the readout of full body poses at a subset of visible joints without the ne...


Real-time marker-less multi-person 3D pose estimation in RGB-Depth camera networks

This paper proposes a novel system to estimate and track the 3D poses of multiple persons in calibrated RGB-Depth camera networks. The multi-view 3D pose of each person is computed by a central node which receives the single-view outcomes from each camera of the network. Each single-view outcome is computed by using a CNN for 2D pose estimation and extending the resulting skeletons to 3D by mean...


Human Body Pose Estimation Using Silhouette Shape Analysis

We describe a system for human body pose estimation from multiple views that is fast and not dependent on a rigid 3D model. We make use of recent work in decomposition of a silhouette into 2D parts. These 2D part primitives are matched across views to build assemblies in 3D. In order to search for the best assembly, we use a likelihood function that integrates information available from multipl...


Journal title:
  • CoRR

Volume abs/1708.09182

Publication date 2017